Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors the AIS system testing workflow by extracting FIO-related test steps from the main run_system_tests job into a separate run_FIO_tests job. The motivation (per the PR description) is to eventually use FIO packages produced by ROCm/fio instead of building FIO separately, though this PR currently still builds FIO from source as an intermediate step.
Changes:
- Removed FIO repository checkout and build directory creation from the
run_system_testsjob - Added cleanup steps to the
run_system_testsjob to ensure proper resource cleanup - Created a new
run_FIO_testsjob that checks out the FIO repository and runs FIO-based tests
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| with: | ||
| repository: ROCm/fio | ||
| ref: hipFile | ||
| path: fio |
There was a problem hiding this comment.
The new run_FIO_tests job is missing the checkout of the hipFile repository. The FIO configure step (line 280-281) references HIPFILE=/ais/hipFile and HIPFILELIB paths, and the FIO test steps (lines 305, 316) reference /ais/hipFile/util/fio/write-read-verify.fio. Without checking out the hipFile repository, these paths won't exist and the job will fail. Add a checkout step for the hipFile repository similar to line 48-51 in the run_system_tests job.
| path: fio | |
| path: fio | |
| - name: Fetching hipFile repository... | |
| uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd #v6.0.2 | |
| with: | |
| repository: ROCm/hipFile | |
| path: hipFile |
| - name: Download hipFile runtime package | ||
| uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0 | ||
| with: | ||
| name: ${{ inputs.ais_hipfile_pkg_filename }} | ||
| - name: Download hipFile development package | ||
| uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0 | ||
| with: | ||
| name: ${{ inputs.ais_hipfile_pkg_dev_filename }} |
There was a problem hiding this comment.
The new run_FIO_tests job is missing the download of the hipFile build directory artifact. The FIO configure step at line 281 references HIPFILELIB=${HIPFILE}/build/src/amd_detail/ which expects the hipFile build artifacts to be present. Without downloading the hipfile-build-dir artifact (as done in line 60-64 of run_system_tests), the FIO configuration and build will fail. Add a step to download the hipfile-build-dir artifact and copy it into the container similar to lines 60-64 and 133-137 in the run_system_tests job.
| "${AIS_CONTAINER_NAME}" \ | ||
| /bin/bash -c ' | ||
| cp -R /mnt/ais /ais | ||
| mkdir /ais/hipFile/build |
There was a problem hiding this comment.
The directory creation command attempts to create /ais/hipFile/build but the hipFile repository has not been checked out or copied into the container in this job. This will fail because there is no /ais/hipFile directory. After adding the hipFile repository checkout step, ensure it's copied into the container before this step, or remove this line since the hipFile build directory artifact should be downloaded and copied separately (as done in the run_system_tests job at lines 60-64 and 133-137).
| mkdir /ais/hipFile/build |
| if: ${{ always() }} | ||
| run: rm -rf ${GITHUB_WORKSPACE}/* ${GITHUB_WORKSPACE}/.* | ||
| run_FIO_tests: | ||
| runs-on: [linux, AIS] |
There was a problem hiding this comment.
The new run_FIO_tests job is missing a dependency on the run_system_tests job. Without specifying 'needs: run_system_tests', both jobs could run concurrently on the same AIS self-hosted runner, potentially causing resource contention (competing for GPU devices, disk I/O, etc.). Consider adding 'needs: run_system_tests' to ensure sequential execution and avoid resource conflicts, unless parallel execution is intentionally desired.
| runs-on: [linux, AIS] | |
| runs-on: [linux, AIS] | |
| needs: run_system_tests |
This is in preparation for calling out to the ROCm/fio workflow to build and package FIO with hipFile support separately. This workflow would then install the FIO packages produced instead of compiling it separately. This will allow us to at least sanity check the FIO packages we produce and publish.
03afdaa to
a2b2af8
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
.github/workflows/test-ais-system.yml:306
- The PR description states "TBD - Use the FIO packages produced by ROCm/fio instead of building it separately here", but the implementation still builds FIO from source in the
run_FIO_testsjob (lines 283-306). The current changes create abuild_FIOjob that isn't being used, making this change incomplete. The implementation should either: (1) download and use pre-built FIO packages from thebuild_FIOjob, or (2) wait to introduce thebuild_FIOjob until the integration is complete.
- name: Configure fio
run: |
docker exec \
-t \
-w /ais/fio/build \
"${AIS_CONTAINER_NAME}" \
/bin/bash -c '
ROCM=/opt/rocm-${ROCM_VERSION} \
HIPFILE=/ais/hipFile \
HIPFILELIB=${HIPFILE}/build/src/amd_detail/ \
HIP_PLATFORM=amd \
CFLAGS="-I${ROCM}/include" \
LDFLAGS="-L${ROCM}/lib -Wl,-rpath,${ROCM}/lib" \
../configure --enable-libhipfile
'
- name: Build fio
run: |
docker exec \
-t \
-w /ais/fio/build \
"${AIS_CONTAINER_NAME}" \
/bin/bash -c '
make -j
'
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Adds a level of indirection, but we no longer need to worry about | ||
| # adding logic to allow certain jobs to have been skipped depending | ||
| # on the platform of this current workflow, and all others downstream. |
There was a problem hiding this comment.
The comment states "Adds a level of indirection, but we no longer need to worry about adding logic to allow certain jobs to have been skipped depending on the platform of this current workflow, and all others downstream." However, this rationale is unclear. The comment should explain more specifically what problem this indirection solves and what "logic to allow certain jobs to have been skipped" means. Consider expanding this comment to help future maintainers understand the design decision.
| # Adds a level of indirection, but we no longer need to worry about | |
| # adding logic to allow certain jobs to have been skipped depending | |
| # on the platform of this current workflow, and all others downstream. | |
| # Delegate the FIO build to a reusable workflow so that all | |
| # platform-specific build logic (including when to skip or run | |
| # particular build steps for a given OS / platform) is defined | |
| # in a single place. This indirection means this workflow, and | |
| # any downstream workflows that depend on its artifacts, no longer | |
| # need to duplicate complex `if:` conditions or handle cases where | |
| # earlier platform-specific jobs were skipped. |
| build_FIO: | ||
| # Adds a level of indirection, but we no longer need to worry about | ||
| # adding logic to allow certain jobs to have been skipped depending | ||
| # on the platform of this current workflow, and all others downstream. | ||
| uses: ROCm/fio/.github/workflows/build-debian.yml@rildixon/ci-hook-for-hipfile | ||
| with: | ||
| ais_hipfile_pkg_filename: ${{ inputs.ais_hipfile_pkg_filename }} | ||
| ais_hipfile_pkg_dev_filename: ${{ inputs.ais_hipfile_pkg_dev_filename }} | ||
| platform: ${{ inputs.platform }} |
There was a problem hiding this comment.
The build_FIO job has no dependencies on upstream jobs. Since the reusable workflow at ROCm/fio/.github/workflows/build-debian.yml expects ais_hipfile_pkg_filename and ais_hipfile_pkg_dev_filename as inputs, this job should depend on whichever job produces those artifacts (likely needs to be configured at the caller level, similar to how AIS_system_tests depends on build_and_test in ais-ci.yml). Without proper dependencies, the artifacts may not be available when this job runs.
| # Adds a level of indirection, but we no longer need to worry about | ||
| # adding logic to allow certain jobs to have been skipped depending | ||
| # on the platform of this current workflow, and all others downstream. | ||
| uses: ROCm/fio/.github/workflows/build-debian.yml@rildixon/ci-hook-for-hipfile |
There was a problem hiding this comment.
The reusable workflow reference uses a non-standard branch name rildixon/ci-hook-for-hipfile. For production use, this should reference a stable branch (like main, develop, or a tagged version) rather than what appears to be a personal development branch. Using personal branches in production workflows can lead to instability if the branch is deleted or force-pushed.
| uses: ROCm/fio/.github/workflows/build-debian.yml@rildixon/ci-hook-for-hipfile | |
| uses: ROCm/fio/.github/workflows/build-debian.yml@main |
| - name: Fetching fio repository... | ||
| uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd #v6.0.2 | ||
| with: | ||
| repository: ROCm/fio | ||
| ref: hipFile | ||
| path: fio | ||
| - name: Download hipFile runtime package | ||
| uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0 | ||
| with: | ||
| name: ${{ inputs.ais_hipfile_pkg_filename }} | ||
| - name: Download hipFile development package | ||
| uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0 | ||
| with: | ||
| name: ${{ inputs.ais_hipfile_pkg_dev_filename }} | ||
| - name: Authenticating to GitHub Container Registry | ||
| uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 #v3.7.0 | ||
| with: | ||
| registry: ghcr.io | ||
| username: ${{ github.actor }} | ||
| password: ${{ secrets.GITHUB_TOKEN }} | ||
| - name: Starting Docker Container | ||
| run: | | ||
| docker run \ | ||
| -dt \ | ||
| --rm \ | ||
| --device=/dev/kfd \ | ||
| --device=/dev/dri \ | ||
| --security-opt seccomp=unconfined \ | ||
| --pull always \ | ||
| -v ${GITHUB_WORKSPACE}:/mnt/ais:ro \ | ||
| -v "${AIS_MOUNT_PATH}:/mnt/ais-fs" \ | ||
| --name "${AIS_CONTAINER_NAME}" \ | ||
| "${AIS_INPUT_CI_IMAGE}" | ||
| - name: Create hipfile IO test directory | ||
| run: | | ||
| docker exec -t "${AIS_CONTAINER_NAME}" /bin/bash -c "mkdir -p /mnt/ais-fs/${AIS_CONTAINER_NAME}" | ||
| - name: Make copy of the code repository and create build directories | ||
| run: | | ||
| docker exec \ | ||
| -t \ | ||
| "${AIS_CONTAINER_NAME}" \ | ||
| /bin/bash -c ' | ||
| cp -R /mnt/ais /ais | ||
| mkdir /ais/fio/build | ||
| ' | ||
| - name: Copy the hipFile packages into the container | ||
| run: | | ||
| docker cp \ | ||
| "${GITHUB_WORKSPACE}/${AIS_INPUT_HIPFILE_PKG_FILENAME}" \ | ||
| "${AIS_CONTAINER_NAME}:/root" | ||
| docker cp \ | ||
| "${GITHUB_WORKSPACE}/${AIS_INPUT_HIPFILE_PKG_DEV_FILENAME}" \ | ||
| "${AIS_CONTAINER_NAME}:/root" | ||
| - name: Install the hipFile packages | ||
| run: | | ||
| docker exec \ | ||
| -t \ | ||
| -w /root \ | ||
| "${AIS_CONTAINER_NAME}" \ | ||
| /bin/bash -c ' | ||
| ${{ | ||
| format( | ||
| env.AIS_PKG_MGR == 'apt' && 'apt install -y "./{0}" "./{1}"' || | ||
| env.AIS_PKG_MGR == 'dnf' && 'dnf install -y --cacheonly "./{0}" "./{1}"' || | ||
| env.AIS_PKG_MGR == 'zypper' && 'zypper --no-refresh install -y --allow-unsigned-rpm "./{0}" "./{1}"' || | ||
| 'echo "Unknown platform."; exit 1', | ||
| inputs.ais_hipfile_pkg_filename, | ||
| inputs.ais_hipfile_pkg_dev_filename | ||
| ) | ||
| }} | ||
| ' |
There was a problem hiding this comment.
The run_FIO_tests job depends on build_FIO but never downloads the artifacts produced by that job. Instead, it checks out the FIO repository again (line 212-217) and builds FIO from source (lines 283-306). This defeats the purpose of having a separate build_FIO job. Either the job should download and use the FIO artifacts from build_FIO, or the build_FIO job and its dependency are unnecessary.
454445d to
89335b4
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
.github/workflows/test-ais-system.yml:303
- The PR description states the motivation as "TBD - Use the FIO packages produced by ROCm/fio instead of building it separately here", suggesting the goal is to use pre-built FIO packages. However, the
run_FIO_testsjob still builds fio from source (steps "Configure fio" and "Build fio" at lines 280-303).
If the intention is to use packages from the build_FIO reusable workflow, there should be steps to download those artifacts instead of building from source. If building from source is still intended, the PR description should be updated to reflect the actual changes being made.
- name: Configure fio
run: |
docker exec \
-t \
-w /ais/fio/build \
"${AIS_CONTAINER_NAME}" \
/bin/bash -c '
ROCM=/opt/rocm-${ROCM_VERSION} \
HIPFILE=/ais/hipFile \
HIPFILELIB=${HIPFILE}/build/src/amd_detail/ \
HIP_PLATFORM=amd \
CFLAGS="-I${ROCM}/include" \
LDFLAGS="-L${ROCM}/lib -Wl,-rpath,${ROCM}/lib" \
../configure --enable-libhipfile
'
- name: Build fio
run: |
docker exec \
-t \
-w /ais/fio/build \
"${AIS_CONTAINER_NAME}" \
/bin/bash -c '
make -j
'
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| - name: Set early AIS CI environment variables | ||
| run: echo "AIS_PR_NUMBER=$(echo ${{ github.ref }} | sed 's|[^0-9]||g')" >> "${GITHUB_ENV}" | ||
| - name: Set AIS CI container name | ||
| run: | | ||
| echo "AIS_CONTAINER_NAME=${AIS_PR_NUMBER}_${{ github.job }}_${AIS_INPUT_PLATFORM}_${AIS_INPUT_ROCM_VERSION}" >> "${GITHUB_ENV}" | ||
| # hipFile repo needed for //util directory | ||
| - name: Fetching hipFile repository... | ||
| uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd #v6.0.2 | ||
| with: | ||
| path: hipFile | ||
| - name: Fetching fio repository... | ||
| uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd #v6.0.2 | ||
| with: | ||
| repository: ROCm/fio | ||
| ref: hipFile | ||
| path: fio | ||
| - name: Download hipFile runtime package | ||
| uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0 | ||
| with: | ||
| name: ${{ inputs.ais_hipfile_pkg_filename }} | ||
| - name: Download hipFile development package | ||
| uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0 | ||
| with: | ||
| name: ${{ inputs.ais_hipfile_pkg_dev_filename }} | ||
| - name: Authenticating to GitHub Container Registry | ||
| uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 #v3.7.0 | ||
| with: | ||
| registry: ghcr.io | ||
| username: ${{ github.actor }} | ||
| password: ${{ secrets.GITHUB_TOKEN }} | ||
| - name: Starting Docker Container | ||
| run: | | ||
| docker run \ | ||
| -dt \ | ||
| --rm \ | ||
| --device=/dev/kfd \ | ||
| --device=/dev/dri \ | ||
| --security-opt seccomp=unconfined \ | ||
| --pull always \ | ||
| -v ${GITHUB_WORKSPACE}:/mnt/ais:ro \ | ||
| -v "${AIS_MOUNT_PATH}:/mnt/ais-fs" \ | ||
| --name "${AIS_CONTAINER_NAME}" \ | ||
| "${AIS_INPUT_CI_IMAGE}" | ||
| - name: Create hipfile IO test directory | ||
| run: | | ||
| docker exec -t "${AIS_CONTAINER_NAME}" /bin/bash -c "mkdir -p /mnt/ais-fs/${AIS_CONTAINER_NAME}" | ||
| - name: Make copy of the code repository and create build directories | ||
| run: | | ||
| docker exec \ | ||
| -t \ | ||
| "${AIS_CONTAINER_NAME}" \ | ||
| /bin/bash -c ' | ||
| cp -R /mnt/ais /ais | ||
| mkdir /ais/fio/build | ||
| ' | ||
| - name: Copy the hipFile packages into the container | ||
| run: | | ||
| docker cp \ | ||
| "${GITHUB_WORKSPACE}/${AIS_INPUT_HIPFILE_PKG_FILENAME}" \ | ||
| "${AIS_CONTAINER_NAME}:/root" | ||
| docker cp \ | ||
| "${GITHUB_WORKSPACE}/${AIS_INPUT_HIPFILE_PKG_DEV_FILENAME}" \ | ||
| "${AIS_CONTAINER_NAME}:/root" | ||
| - name: Install the hipFile packages | ||
| run: | | ||
| docker exec \ | ||
| -t \ | ||
| -w /root \ | ||
| "${AIS_CONTAINER_NAME}" \ | ||
| /bin/bash -c ' | ||
| ${{ | ||
| format( | ||
| env.AIS_PKG_MGR == 'apt' && 'apt install -y "./{0}" "./{1}"' || | ||
| env.AIS_PKG_MGR == 'dnf' && 'dnf install -y --cacheonly "./{0}" "./{1}"' || | ||
| env.AIS_PKG_MGR == 'zypper' && 'zypper --no-refresh install -y --allow-unsigned-rpm "./{0}" "./{1}"' || | ||
| 'echo "Unknown platform."; exit 1', | ||
| inputs.ais_hipfile_pkg_filename, | ||
| inputs.ais_hipfile_pkg_dev_filename | ||
| ) | ||
| }} | ||
| ' |
There was a problem hiding this comment.
The run_FIO_tests job duplicates significant infrastructure setup code from the run_system_tests job (approximately 80 lines of identical code including Docker setup, authentication, container startup, package installation, etc.). This duplication makes the workflow harder to maintain - any changes to the infrastructure setup would need to be made in both places.
Consider extracting the common setup steps into a reusable workflow or composite action that both jobs can use. At minimum, ensure that any future changes to Docker setup, package installation, or cleanup steps are synchronized between both jobs.
| build_FIO: | ||
| uses: ROCm/fio/.github/workflows/build-fio.yml@rildixon/ci-hook-for-hipfile | ||
| with: | ||
| ais_hipfile_pkg_filename: ${{ inputs.ais_hipfile_pkg_filename }} | ||
| ais_hipfile_pkg_dev_filename: ${{ inputs.ais_hipfile_pkg_dev_filename }} | ||
| platform: ${{ inputs.platform }} |
There was a problem hiding this comment.
The build_FIO job references a reusable workflow from an external repository branch (rildixon/ci-hook-for-hipfile). This introduces several concerns:
- Using a personal branch as a dependency creates instability - if the branch is deleted, renamed, or force-pushed, this workflow will break
- There's no verification that the called workflow will produce artifacts needed by
run_FIO_tests - The
build_FIOjob doesn't specifyneedsdependencies, but it requires artifacts (hipFile packages) that should be produced by a prior build job
Consider either:
- Using a stable tag/release of the ROCm/fio repository instead of a personal branch
- Or ensuring the branch is merged and using the default branch
- Adding explicit
needsdependencies if the build_FIO job requires artifacts from other jobs
Motivation
TBD - Use the FIO packages produced by ROCm/fio instead of building it separately here.
Technical Details
Test Plan
Test Result
Submission Checklist
AIHIPFILE-121